Abstract. In a group testing scheme, a set of tests is designed to identify a small number
$t$ of defective items among a large set (of size $N$) of items. In the non-adaptive scenario
the set of tests has to be designed in one shot. In this setting, designing a testing scheme is
equivalent to the construction of a disjunct matrix, an $M \times N$ matrix where the union of
supports of any $t$ columns does not contain the support of any other column. In principle,
one wants to have such a matrix with the minimum possible number $M$ of rows (tests). One of
the main ways of constructing disjunct matrices relies on constant weight error-correcting
codes and their minimum distance. In this paper, we consider a relaxed definition of a
disjunct matrix known as an almost disjunct matrix. This concept is also studied under the
name of weakly separated design in the literature. The relaxed definition allows one to
come up with group testing schemes where a close-to-one fraction of all possible sets of
defective items are identifiable. Our main contribution is twofold. First, we go beyond the
minimum distance analysis and connect the average distance of a constant weight code to
the parameters of an almost disjunct matrix constructed from it. Our second contribution
is to explicitly construct almost disjunct matrices based on our average distance analysis
that have a much smaller number of rows than any previous explicit construction of disjunct
matrices. The parameters of our construction can be varied to cover a large range of
relations for $t$ and $N$. As an example of parameters, consider any absolute constant
$\epsilon > 0$ and $t$ proportional to $N^{\delta}$, $\delta > 0$. With our method it is possible
to explicitly construct a group testing scheme that identifies a $(1 - \epsilon)$ proportion of
all possible defective sets of size $t$ using only $O\big(t^{3/2}\sqrt{\log(N/\epsilon)}\big)$ tests.
On the other hand, to form an explicit non-adaptive group testing scheme that works for all
possible defective sets of size $t$, one requires $O(t^2 \log N)$ tests.
1. Introduction

Combinatorial group testing is an old and well-studied problem. In the most general 
form it is assumed that there is a set of N elements among which at most t are defective, 
i.e., special. This set of defective items is called the defective set or configuration. To find 
the defective set, one might test all the elements individually for defects, requiring N tests. 
Intuitively, that would be a waste of resources if $t \ll N$. On the other hand, to identify
the defective configuration it is required to ask at least
$\log \sum_{i=0}^{t} \binom{N}{i} \approx t \log \frac{N}{t}$ yes-no
questions. The main objective is to identify the defective configuration with a number of
tests that is as close to this minimum as possible.

In the group testing problem, a group of elements are tested together and if this particular 
group contains any defective element the test result is positive. Based on the test results of 
this kind one identifies (with an efficient algorithm) the defective set with minimum possible 
number of tests. The schemes (grouping of elements) can be adaptive, where the design of 
one test may depend on the results of preceding tests. For a comprehensive survey of 
adaptive group testing schemes we refer the reader to [9].

In this paper we are interested in non-adaptive group testing schemes: here all the tests 
are designed together. If the number of designed tests is M, then a non-adaptive group 
testing scheme is equivalent to the design of a so-called binary test matrix of size M x N 
where the (i, j)th entry is 1 if the ith test includes the jth element; it is 0 otherwise. As the
test results, we see the Boolean OR of the columns corresponding to the defective entries. 
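
As a small illustration of this matrix view (a sketch with a hypothetical 4 x 6 matrix and a hypothetical helper name `pooled_outcomes`, neither of which comes from the constructions discussed in this paper), the observed test results are the Boolean OR of the columns indexed by the defective set:

```python
def pooled_outcomes(matrix, defective):
    """Return the test results: the Boolean OR of the columns in `defective`."""
    return [int(any(row[j] for j in defective)) for row in matrix]


# Hypothetical 4 x 6 test matrix: entry (i, j) is 1 if test i includes item j.
A = [
    [1, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 1],
]

outcomes = pooled_outcomes(A, {1, 4})
print(outcomes)  # [1, 1, 1, 0]
```

Here items 1 and 4 are taken to be defective; the last test contains neither of them and is therefore the only negative one.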

Extensive research has been performed to find out the minimum number of required tests 
M in terms of the number of elements N and the maximum number of defective elements 
t. The best known lower bound says that it is necessary to have
$M = \Omega\big(\frac{t^2}{\log t} \log N\big)$
tests [10, 12]. The existence of non-adaptive group testing schemes with $M = O(t^2 \log N)$
has also been known for quite some time [9, 17].

Evidently, there is a gap by a factor of $O(\log t)$ between these upper and lower bounds. It
is generally believed that it is hard to close the gap. On the other hand, for the adaptive
setting, schemes have been constructed with as few as $O(t \log N)$ tests, optimal up to a
constant factor [9, 15].

A construction of group testing schemes from error-correcting code matrices and using
code concatenation appeared in the seminal paper by Kautz and Singleton [19]. Code
concatenation is a way to construct binary codes from codes over a larger alphabet [22].
In [19], the authors concatenate a $q$-ary ($q > 2$) Reed-Solomon code with a unit weight
code to use the resulting codewords as the columns of the testing matrix. Recently in [28],
an explicit construction of a scheme with $M = O(t^2 \log N)$ tests is provided. The
construction of [28] is based on the idea of [19]: instead of the Reed-Solomon code, they take a
low-rate code that achieves the Gilbert-Varshamov bound of coding theory [22, 29]. Papers
such as [11, 33] also consider construction of non-adaptive group testing schemes.

In this paper we explicitly construct a non-adaptive scheme that requires a number of tests
proportional to $t^{3/2}$. However, we needed to relax the requirement of identification of
defective elements in a way that makes it amenable to our analysis. Schemes with this relaxed
requirement were considered under the name of weakly separated designs in [23] and [34]. Our
definition of this relaxation appeared previously in the paper [21]. We (and [21, 23, 34]) aim
for a scheme that successfully identifies a large fraction of all possible defective
configurations. Non-adaptive group testing has found applications in multiple different areas,
such as multi-user communication [3, 32], DNA screening [26], pattern finding [20], etc. It can be
observed that in many of these applications it would still have been useful to have a scheme
that identifies almost all defective configurations, if not all possible defective
configurations. It is known (see [34]) that with this relaxation it might be possible to reduce the
number of tests to be proportional to $t \log N$. However, this result is not constructive. The
above relaxation and weakly separated designs form a parallel of similar works in compres- 
sive sensing (see, |T,'24]) where recovery of almost all sparse signals from a generic random 
model is considered. In the literature, other relaxed versions of the group testing problem 
have been studied as well. For example, in [14] it is assumed that recovering a large
fraction of defective elements is sufficient. There is also an effort to form an information-theoretic
model for the group testing problem where test results can be noisy [2]. In other versions of
the group testing problem, a test may carry more than one bit of information [4, 16], or the
test results are threshold-based (see [6] and references therein). Algorithmic aspects of the
recovery schemes have been studied in several papers. For example, papers [18] and [27]
provide very efficient recovery algorithms for non-adaptive group testing.

1.1. Results. The constructions of [19, 28] and many others are based on so-called constant
weight error-correcting codes, a set of binary vectors of the same Hamming weight (number
of ones). The group-testing recovery property relies on the pairwise minimum distance
between the vectors of the code [19]. In this work, we go beyond this minimum distance
analysis and relate the group-testing parameters to the average distance of the constant
weight code. This allows us to connect weakly separated designs to error-correcting codes
in a general way. Previously the connection between distances of the code and weakly
separated designs was only known for the very specific family of maximum distance separable
codes [21], where much more information than the average distance is evident.

Based on the newfound connection, we construct an explicit (constructible deterministically
in polynomial time) scheme of non-adaptive group testing that can identify all except
an $\epsilon > 0$ fraction of all defective sets of size at most $t$. To be specific, we show that it is
possible to explicitly construct a group testing scheme that identifies a $(1 - \epsilon)$ proportion of
all possible defective sets of size $t$ using only

$$8e\, t^{3/2} \log N \, \frac{\sqrt{\log \frac{2(N-t)}{\epsilon}}}{\log t - \log\log \frac{2(N-t)}{\epsilon}}$$

tests for any $\epsilon > 2(N - t)e^{-t}$. It can be seen that, with the relaxation in requirement, the number of
tests is brought down to be proportional to $t^{3/2}$ from $t^2$. This allows us to operate with a
number of tests that was previously not possible in explicit constructions of non-adaptive
group testing. For a large range of values of $t$, namely $t$ being proportional to any positive
power of $N$, i.e., $t \sim N^{\delta}$, and constant $\epsilon$, our scheme has number of tests only about
$O\big(t^{3/2}\sqrt{\log(N/\epsilon)}\big)$. Our construction technique is the same as the scheme of [19, 28];
however, with a finer analysis relying on the distance properties of a linear code we are able to
achieve more.

In Section 2 we provide the necessary definitions and state one of the main results: we
state the connection between the parameters of a weakly separated design and the average
distance of a constant weight code. In Section 4 we discuss our construction scheme. The
proofs of our claims can be found in Sections 3 and 4.

2. Disjunct Matrices

2.1. Lower bounds. It is easy to see that, if an $M \times N$ binary matrix gives a non-adaptive
group testing scheme that identifies up to $t$ defective elements, then
$\sum_{i=0}^{t} \binom{N}{i} \le 2^M$. This means that for any group testing scheme,

$$M \ge \log \sum_{i=0}^{t} \binom{N}{i} \ge t \log \frac{N}{t}. \qquad (1)$$

Consider the case when one is interested in a scheme that identifies all possible except an $\epsilon$
fraction of the different defective sets. Then it is required that,

$$M \ge \log\Big((1 - \epsilon)\binom{N}{t}\Big) \ge t \log \frac{N}{t} + \log(1 - \epsilon). \qquad (2)$$

Although (1) is proven to be a loose bound, it is shown in [23, 34] that (2) is tight.
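
To get a feel for these counting bounds, the following sketch evaluates them numerically. It assumes base-2 logarithms (yes-no questions) and reads the relaxed requirement as: a scheme that succeeds on a $(1-\epsilon)$ fraction of the size-$t$ defective sets must still produce at least $(1-\epsilon)\binom{N}{t}$ distinct outcome vectors. The function names are ours and purely illustrative.

```python
from math import comb, log2


def counting_bound(n_items, t):
    """log2 of the number of defective sets of size at most t."""
    return log2(sum(comb(n_items, i) for i in range(t + 1)))


def simple_bound(n_items, t):
    """The simpler expression t * log2(N / t)."""
    return t * log2(n_items / t)


def relaxed_bound(n_items, t, eps):
    """log2((1 - eps) * C(N, t)), split into two terms to avoid float overflow."""
    return log2(comb(n_items, t)) + log2(1 - eps)


for n_items, t in [(1000, 5), (10**6, 100)]:
    print(n_items, t,
          round(counting_bound(n_items, t), 1),
          round(simple_bound(n_items, t), 1),
          round(relaxed_bound(n_items, t, 0.01), 1))
```

For a constant $\epsilon$ the relaxed bound differs from the exact one only by a small additive term, so the counting argument alone does not separate the two problems.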

2.2. Disjunct matrices. The support of a vector $x$ is the set of coordinates where the
vector has nonzero entries. It is denoted by $\mathrm{supp}(x)$. We use the usual set terminology,
where a set $A$ contains $B$ if $B \subseteq A$.

Definition 1. An M x N binary matrix A is called t-disjunct if the support of any column 
is not contained in the union of the supports of any other t columns. 

It is not very difficult to see that a t-disjunct matrix gives a group testing scheme that 
identifies any defective set up to size t. On the other hand any group testing scheme that 
identifies any defective set up to size $t$ must be a $(t - 1)$-disjunct matrix [9]. To a great
advantage, disjunct matrices allow for a simple identification algorithm that runs in time
$O(Nt)$. Below we define relaxed disjunct matrices. This definition appeared very closely
in [23, 31] and independently exactly in [21].
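
The following sketch makes Definition 1 and the simple identification rule concrete. It is a brute-force illustration only (the disjunctness check is exponential in $t$), the helper names are ours, and the decoder shown is the plain "an item is defective iff every test containing it is positive" rule rather than an optimized implementation.

```python
from itertools import combinations


def column_supports(matrix):
    """Support (set of row indices holding a 1) of every column."""
    n_cols = len(matrix[0])
    return [frozenset(i for i, row in enumerate(matrix) if row[j])
            for j in range(n_cols)]


def is_t_disjunct(matrix, t):
    """Brute-force check of Definition 1 (small cases only)."""
    supp = column_supports(matrix)
    n_cols = len(supp)
    for j in range(n_cols):
        others = [k for k in range(n_cols) if k != j]
        for group in combinations(others, t):
            union = frozenset().union(*(supp[k] for k in group))
            if supp[j] <= union:
                return False
    return True


def naive_decode(matrix, outcomes):
    """Declare item j defective iff every test that includes j is positive."""
    positive = {i for i, y in enumerate(outcomes) if y}
    return {j for j, s in enumerate(column_supports(matrix)) if s <= positive}


# Hypothetical example: the 6 columns are all weight-2 vectors of length 4.
pairs = list(combinations(range(4), 2))
A = [[1 if i in pair else 0 for pair in pairs] for i in range(4)]

print(is_t_disjunct(A, 1), is_t_disjunct(A, 2))  # True False
outcomes = [1 if i in pairs[2] else 0 for i in range(4)]  # only item 2 is defective
print(naive_decode(A, outcomes))  # {2}
```

The example matrix is 1-disjunct but not 2-disjunct: the column with support {0, 1} is covered by the union of the columns with supports {0, 2} and {1, 3}.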

Definition 2. For any $\epsilon > 0$, an $M \times N$ matrix $A$ is called type-1 $(t, \epsilon)$-disjunct if the set of
$t$-tuples of columns (of size $\binom{N}{t}$) has a subset $\mathcal{B}$ of size at least $(1 - \epsilon)\binom{N}{t}$ with the following
property: for all $J \in \mathcal{B}$, $\bigcup_{\kappa \in J} \mathrm{supp}(\kappa)$ does not contain the support of any column $\nu \notin J$.

In other words, the union of supports of a randomly and uniformly chosen set of $t$
columns from a type-1 $(t, \epsilon)$-disjunct matrix does not contain the support of any other
column with probability at least $1 - \epsilon$. It is easy to see the following fact.

Proposition 1. A type-1 $(t, \epsilon)$-disjunct matrix gives a group testing scheme that can identify
all but at most a fraction $\epsilon > 0$ of all possible defective configurations of size at most $t$.
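
For small matrices the smallest $\epsilon$ for which a matrix is type-1 $(t, \epsilon)$-disjunct can be found by direct enumeration. The sketch below does exactly that; the matrix and the function name `type1_failure_fraction` are hypothetical and chosen only to illustrate the definition.

```python
from itertools import combinations
from math import comb


def type1_failure_fraction(matrix, t):
    """Fraction of t-subsets J of columns whose union of supports
    contains the support of some column outside J."""
    n_cols = len(matrix[0])
    supp = [frozenset(i for i, row in enumerate(matrix) if row[j])
            for j in range(n_cols)]
    bad = 0
    for group in combinations(range(n_cols), t):
        union = frozenset().union(*(supp[k] for k in group))
        if any(supp[u] <= union for u in range(n_cols) if u not in group):
            bad += 1
    return bad / comb(n_cols, t)


# Hypothetical 4 x 5 matrix: four unit columns plus one column of weight two.
A = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
print(type1_failure_fraction(A, 1))  # 0.2
print(type1_failure_fraction(A, 2))  # 0.5
```

This toy matrix is not even 1-disjunct, yet it is type-1 $(1, 0.2)$-disjunct and type-1 $(2, 0.5)$-disjunct, which shows how the relaxed definition trades a fraction of failing configurations for fewer structural constraints.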

The definition of disjunct matrix can be restated as follows: a matrix is $t$-disjunct if
any $t + 1$ columns indexed by $i_1, \ldots, i_{t+1}$ of the matrix form a sub-matrix which must
have a row that has exactly one 1 in the $i_j$th position and zeros in the other positions, for
$j = 1, \ldots, t + 1$. Recall that a permutation matrix is a square binary $\{0, 1\}$-matrix with
exactly one 1 in each row and each column. Hence, for a $t$-disjunct matrix, any $t + 1$
columns form a sub-matrix that must contain $t + 1$ rows such that a $(t + 1) \times (t + 1)$
permutation matrix is formed of these rows and columns. A statistical relaxation of the
above definition gives the following.

Definition 3. For any $\epsilon > 0$, an $M \times N$ matrix $A$ is called type-2 $(t, \epsilon)$-disjunct if the set
of $(t + 1)$-tuples of columns (of size $\binom{N}{t+1}$) has a subset $\mathcal{B}$ of size at least $(1 - \epsilon)\binom{N}{t+1}$ with
the following property: the $M \times (t + 1)$ matrix formed by any element $J \in \mathcal{B}$ must contain
$t + 1$ rows that form a $(t + 1) \times (t + 1)$ permutation matrix.

In other words, with probability at least $1 - \epsilon$, any randomly and uniformly chosen $t + 1$
columns from a type-2 $(t, \epsilon)$-disjunct matrix form a sub-matrix that must have $t + 1$ rows
such that a $(t + 1) \times (t + 1)$ permutation matrix can be formed. It is clear that for $\epsilon = 0$,
the type-1 and type-2 $(t, \epsilon)$-disjunct matrices are the same (i.e., $t$-disjunct). In the rest of the
paper, we concentrate on the design of an $M \times N$ matrix $A$ that is type-2 $(t, \epsilon)$-disjunct.
Our technique can be easily extended to the construction of type-1 disjunct matrices.

2.3. Constant weight codes and disjunct matrices. A binary $(M, N, d)$ code $C$ is a set
of size $N$ consisting of $\{0, 1\}$-vectors of length $M$. Here $d$ is the largest integer such that
any two vectors (codewords) of $C$ are at least Hamming distance $d$ apart; $d$ is called the
minimum distance (or distance) of $C$. If all the codewords of $C$ have Hamming weight $w$,
then it is called a constant weight code. In that case we write that $C$ is an $(M, N, d, w)$ constant
weight binary code.

Constant weight codes can give constructions of group testing schemes. One just
arranges the codewords as the columns of the test matrix. Kautz and Singleton proved the
following in [19].

Proposition 2. An $(M, N, d, w)$ constant weight binary code provides a $t$-disjunct matrix, where

$$t = \left\lfloor \frac{w - 1}{w - d/2} \right\rfloor.$$

Proof. The intersection of supports of any two columns has size at most $w - d/2$. Hence if
$w > t(w - d/2)$, the support of any column will not be contained in the union of supports of
any $t$ other columns. □
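
As a worked instance of Prop. 2, the sketch below computes $t$ from the weight and minimum distance of a small constant weight code. It assumes the reading $t = \lfloor (w-1)/(w - d/2) \rfloor$, which is the form used for the Kautz-Singleton parameters in Section 4.1. The example code (indicator vectors of the 12 lines of the affine plane over $\mathbb{Z}_3$) is our own choice for illustration and does not appear in the paper.

```python
from itertools import combinations


def hamming(x, y):
    """Hamming distance between two equal-length 0/1 lists."""
    return sum(a != b for a, b in zip(x, y))


def prop2_t(codewords):
    """t = floor((w - 1) / (w - d/2)) for a constant weight code."""
    w = sum(codewords[0])
    if any(sum(c) != w for c in codewords):
        raise ValueError("not a constant weight code")
    d = min(hamming(x, y) for x, y in combinations(codewords, 2))
    if d == 2 * w:  # pairwise disjoint supports: every t is admissible
        return len(codewords) - 1
    return (2 * (w - 1)) // (2 * w - d)


def affine_plane_lines(p=3):
    """Indicator vectors (length p*p) of the lines of the affine plane over Z_p."""
    lines = []
    for m in range(p):
        for b in range(p):
            lines.append({p * x + (m * x + b) % p for x in range(p)})
    for c in range(p):
        lines.append({p * c + y for y in range(p)})
    return [[1 if i in line else 0 for i in range(p * p)] for line in lines]


code = affine_plane_lines(3)
print(len(code), prop2_t(code))  # 12 2
```

Two distinct lines meet in at most one point, so $w = 3$, $d = 4$ and the resulting $9 \times 12$ matrix is 2-disjunct.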

2.4. $(t, \epsilon)$-disjunct matrices from constant weight codes. We extend Prop. 2 to obtain one
of our main theorems. However, to do that we need to define the average distance $D$ of a
code $C$:

$$D = \frac{1}{\binom{N}{2}} \sum_{\{x, y\} \subseteq C,\ x \ne y} d_H(x, y).$$

Here $d_H(x, y)$ denotes the Hamming distance between $x$ and $y$.

Theorem 3. Suppose we have a constant weight binary code $C$ of size $N$, minimum
distance $d$ and average distance $D$ such that every codeword has length $M$ and weight $w$. The
test matrix obtained from the code is type-2 $(t, \epsilon)$-disjunct for the largest $t$ such that

$$\frac{d}{2} \ge w - \frac{w - 1 - t(w - D/2)}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}}$$

holds. Here $\alpha$ is any absolute constant greater than or equal to $\sqrt{2}\,(1 + t/(N - 1))$.

The proof of this theorem is deferred until after the following remarks.

Remark: By a simple change in the proof of Theorem 3 it is possible to see that the
test matrix is type-1 $(t, \epsilon)$-disjunct if

$$\frac{d}{2} \ge w - \frac{w - 1 - t(w - D/2)}{\alpha \sqrt{t \ln \frac{2(N-t)}{\epsilon}}}$$

for an absolute constant $\alpha$.

One can compare the results of Prop. 2 and Theorem 3 to see the improvement achieved
as we relax the definition of disjunct matrices. This will lead to the final improvement on
the parameters of the Porat-Rothschild construction [28], as we will see in Section 4.
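
To compare Prop. 2 with Theorem 3 numerically, the sketch below searches for the largest $t$ satisfying the condition in the form obtained at the end of Section 3, namely $d/2 \ge w - \frac{w - 1 - t(w - D/2)}{\alpha\sqrt{t \ln(2(t+1)/\epsilon)}}$ with $\alpha = \sqrt{2}\,(1 + t/(N-1))$. The parameter values are hypothetical round numbers chosen for illustration; we do not claim that a code with exactly these parameters exists.

```python
from math import log, sqrt


def theorem3_holds(w, d, D, n_codewords, t, eps):
    """Check d/2 >= w - (w - 1 - t(w - D/2)) / (alpha * sqrt(t ln(2(t+1)/eps)))."""
    alpha = sqrt(2) * (1 + t / (n_codewords - 1))
    slack = w - 1 - t * (w - D / 2)
    return d / 2 >= w - slack / (alpha * sqrt(t * log(2 * (t + 1) / eps)))


def largest_t(w, d, D, n_codewords, eps, t_max):
    """Largest t in [1, t_max] for which the condition holds (0 if none)."""
    best = 0
    for t in range(1, t_max + 1):
        if theorem3_holds(w, d, D, n_codewords, t, eps):
            best = t
    return best


# Hypothetical parameters: w - d/2 = 100 and w - D/2 = 10.
w, d, D, n_codewords, eps = 10_000, 19_800, 19_980, 10**6, 0.01

t_min_distance = (w - 1) // (w - d // 2)  # Prop. 2
t_average = largest_t(w, d, D, n_codewords, eps, t_max=1000)
print(t_min_distance, t_average)
```

With these numbers the minimum distance argument gives $t = 99$, while the average distance condition is still satisfied for $t$ around 250, at the price of failing on at most an $\epsilon$ fraction of the column tuples.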
3. Proof of Theorem 3

This section is dedicated to the proof of Theorem 3. Suppose we have a constant weight
binary code $C$ of size $N$ and minimum distance $d$ such that every codeword has length $M$
and weight $w$. Let the average distance of the code be $D$. Note that this code is fixed: we
will prove a property of this code by the probabilistic method.

Let us now choose $t + 1$ codewords randomly and uniformly from all possible $\binom{N}{t+1}$
choices. Let the randomly chosen codewords be $\{c_1, c_2, \ldots, c_{t+1}\}$. In what follows, we
adapt the proof of Prop. 2 to a probabilistic setting.

Define the random variables, for $i = 1, \ldots, t + 1$,

$$Z^i = \sum_{j=1,\, j \ne i}^{t+1} \Big(w - \frac{d_H(c_i, c_j)}{2}\Big).$$

Clearly, $Z^i$ is the maximum possible size of the portion of the support of $c_i$ that is common
to at least one of $c_j$, $j = 1, \ldots, t + 1$, $j \ne i$. Note that the size of the support of $c_i$ is $w$.
Hence, as we have seen in the proof of Prop. 2, if $Z^i$ is less than $w$ for all $i = 1, \ldots, t + 1$,
then the $M \times (t + 1)$ matrix formed by the $t + 1$ codewords must contain $t + 1$ rows such
that a $(t + 1) \times (t + 1)$ permutation matrix can be formed. Therefore, we aim to find the
probability $\Pr(\exists i \in \{1, \ldots, t + 1\} : Z^i \ge w)$ and show it to be bounded above by $\epsilon$ under
the condition of the theorem.

As the variables $Z^i$ are identically distributed, we see that

$$\Pr(\exists i \in \{1, \ldots, t + 1\} : Z^i \ge w) \le (t + 1) \Pr(Z^1 \ge w).$$

In the following, we will find an upper bound on $\Pr(Z^1 \ge w)$.
Define

$$Z_i = \mathbb{E}\Big( \sum_{j=2}^{t+1} \Big(w - \frac{d_H(c_1, c_j)}{2}\Big) \;\Big|\; d_H(c_1, c_k),\ k = 2, 3, \ldots, i \Big).$$

Clearly, $Z_1 = \mathbb{E}\big( \sum_{j=2}^{t+1} (w - d_H(c_1, c_j)/2) \big)$, and
$Z_{t+1} = \sum_{j=2}^{t+1} (w - d_H(c_1, c_j)/2) = Z^1$.

Now,

$$Z_1 = \mathbb{E} \sum_{j=2}^{t+1} \Big(w - \frac{d_H(c_1, c_j)}{2}\Big) = tw - \frac{1}{2}\, \mathbb{E} \sum_{j=2}^{t+1} d_H(c_1, c_j),$$

where the expectation is over the randomly and uniformly chosen $(t + 1)$ codewords from
all possible $\binom{N}{t+1}$ choices. Note,

$$\mathbb{E} \sum_{j=2}^{t+1} d_H(c_1, c_j)
= \frac{1}{\binom{N}{t+1}} \sum_{i_1 < i_2 < \cdots < i_{t+1}} \frac{1}{t+1} \sum_{1 \le l \ne m \le t+1} d_H(c_{i_l}, c_{i_m})
= \sum_{j=2}^{t+1} \mathbb{E}\, d_H(c_1, c_j)
= tD,$$

where the expectation in the last-but-one expression is over a uniformly chosen pair of distinct
random codewords of $C$. Hence,

$$Z_1 \le t(w - D/2).$$
We start with the lemma below.

Lemma 4. The sequence of random variables $Z_i$, $i = 1, \ldots, t + 1$, forms a martingale.

The statement is true by construction. For completeness we present a proof that is
deferred to Appendix A. Once we have proved that the sequence is a martingale, we show that
it is a bounded-difference martingale.

Lemma 5. For any $i = 2, \ldots, t + 1$,

$$|Z_i - Z_{i-1}| \le (w - d/2)\Big(1 + \frac{t - i + 1}{N - i}\Big).$$

The proof is deferred to Appendix B.

Now using Azuma's inequality for martingales with bounded differences [25], we have

$$\Pr\big(|Z_{t+1} - Z_1| > \nu\big) \le 2 \exp\Big(-\frac{\nu^2}{2(w - d/2)^2 \sum_{i=2}^{t+1} c_i^2}\Big),$$

where $c_i = 1 + \frac{t - i + 1}{N - i}$. This implies

$$\Pr\big(|Z_{t+1}| > \nu + t(w - D/2)\big) \le 2 \exp\Big(-\frac{\nu^2}{2(w - d/2)^2 \sum_{i=2}^{t+1} c_i^2}\Big).$$

Setting $\nu = w - 1 - t(w - D/2)$, we have

$$\Pr\big(Z^1 > w - 1\big) \le 2 \exp\Big(-\frac{(w - 1 - t(w - D/2))^2}{2(w - d/2)^2 \sum_{i=2}^{t+1} c_i^2}\Big).$$

Now,

$$\sum_{i=2}^{t+1} c_i^2 \le t\Big(1 + \frac{t}{N - 1}\Big)^2.$$

Hence,

$$\Pr(\exists i \in \{1, \ldots, t + 1\} : Z^i \ge w) \le 2(t + 1) \exp\Big(-\frac{(w - 1 - t(w - D/2))^2}{2t(w - d/2)^2 \big(1 + \frac{t}{N-1}\big)^2}\Big) \le \epsilon$$

when

$$\frac{d}{2} \ge w - \frac{w - 1 - t(w - D/2)}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}},$$

and $\alpha$ is a constant greater than $\sqrt{2}\,\big(1 + \frac{t}{N-1}\big)$.

4. Construction 

As we have seen in Section 2, constant weight codes can be used to produce disjunct
matrices. Kautz and Singleton [19] give a construction of constant weight codes that results
in good disjunct matrices. In their construction, they start with a Reed-Solomon (RS) code,
a $q$-ary error-correcting code of length $q - 1$. For a detailed discussion of RS codes we refer
the reader to the standard textbooks of coding theory [22, 29]. Next they replace the $q$-ary
symbols in the codewords by unit weight binary vectors of length $q$. The mapping from
$q$-ary symbols to length-$q$ unit weight binary vectors is bijective: i.e., it is $0 \to 100\ldots0$;
$1 \to 010\ldots0$; $\ldots$; $q - 1 \to 0\ldots01$. We refer to this mapping as $\phi$. As a result, one obtains a set
of binary vectors of length $q(q - 1)$ and constant weight $q - 1$. The size of the resulting binary
code is the same as the size of the RS code, and the distance of the binary code is twice
the distance of the RS code.
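
The construction just described can be sketched in a few lines for a prime field. The code below is an illustration under our own assumptions: it takes $q = p$ prime (so that arithmetic modulo $p$ is field arithmetic), builds the RS code as evaluations of all polynomials of degree less than $k$ at the $p - 1$ nonzero points, and applies the unit-vector map. The function names are hypothetical, and the brute-force distance computation is only meant for tiny parameters.

```python
from itertools import combinations, product


def rs_codewords(p, k):
    """Evaluations of all polynomials of degree < k over GF(p), p prime,
    at the p - 1 nonzero field elements."""
    points = range(1, p)
    return [
        [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in points]
        for coeffs in product(range(p), repeat=k)
    ]


def kautz_singleton(p, k):
    """Replace every symbol s by the length-p unit vector with a 1 in position s."""
    columns = []
    for word in rs_codewords(p, k):
        column = []
        for symbol in word:
            column.extend(1 if s == symbol else 0 for s in range(p))
        columns.append(column)
    return columns


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


p, k = 5, 2
columns = kautz_singleton(p, k)
dists = [hamming(x, y) for x, y in combinations(columns, 2)]
min_d = min(dists)
avg_d = sum(dists) / len(dists)
print(len(columns), len(columns[0]), sum(columns[0]), min_d, round(avg_d, 4))
# 25 20 4 6 6.6667
```

For $p = 5$ and $k = 2$ this gives $N = 25$ columns of length $q(q-1) = 20$ and weight $q - 1 = 4$, minimum distance $2(q - \log_q N) = 6$, and average distance $\frac{2N(q-1)^2}{q(N-1)} = 20/3$, in agreement with the expressions used in Section 4.1.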

4.1. Consequence of Theorem 3 in Kautz-Singleton construction. For a $q$-ary RS code
of size $N$ and length $q - 1$, the minimum distance is $q - 1 - \log_q N + 1 = q - \log_q N$. Hence,
the Kautz-Singleton construction is a constant-weight code with length $M = q(q - 1)$,
weight $w = q - 1$, size $N$ and distance $2(q - \log_q N)$. Therefore, from Prop. 2, we have a
$t$-disjunct matrix with

$$t = \frac{q - 1 - 1}{q - 1 - q + \log_q N} = \frac{q - 2}{\log_q N - 1} \approx \frac{q \log q}{\log N} \approx \frac{\sqrt{M} \log M}{2 \log N}.$$

On the other hand, note that the average distance of the RS code is $\frac{N}{N-1}(q - 1)(1 - 1/q)$.
Hence the average distance of the resulting constant weight code from the Kautz-Singleton
construction will be

$$D = \frac{2N(q - 1)^2}{q(N - 1)}.$$

Now, substituting these values in Theorem 3, we have a type-1 $(t, \epsilon)$-disjunct matrix, where

$$\alpha \sqrt{t \ln \frac{2(N - t)}{\epsilon}} \le \frac{(q - t) \log q}{\log N} \approx \frac{(\sqrt{M} - t) \log M}{2 \log N}.$$

Suppose $t \le \sqrt{M}/2$. Then,

$$M (\ln M)^2 \ge 4\alpha^2 t (\ln N)^2 \ln \frac{2(N - t)}{\epsilon}.$$

This basically restricts $t$ to be about $O(\sqrt{M})$. Hence, Theorem 3 does not obtain any
meaningful improvement from the Kautz-Singleton construction except in special cases.

There are two places where the Kautz-Singleton construction can be improved: 1) instead
of the Reed-Solomon code one can use any other $q$-ary code of different length, and 2) instead
of the mapping $\phi$ any binary constant weight code of size $q$ might have been used. For a
general discussion we refer the reader to lU §7.4]. In the recent work ||28]| . the mapping 
$\phi$ is kept the same, while the RS code has been changed to a $q$-ary code that achieves the
Gilbert-Varshamov bound [22, 29].

In our construction of disjunct matrices we follow the footsteps of [19, 28]. However, we
exploit some property of the resulting scheme (namely, the average distance) and do a finer
analysis that was absent from the previous works such as [28].



4.2. q-ary code construction. We choose $q$ to be a power of a prime number and write
$q = \beta t$, for some constant $\beta > 2$. The value of $\beta$ will be chosen later. Next, we construct a
linear $q$-ary code of size $N$, length $M_q$ and minimum distance $d_q$ that achieves the
Gilbert-Varshamov bound [22, 29], i.e.,

$$\frac{\log_q N}{M_q} \ge 1 - h_q\Big(\frac{d_q}{M_q}\Big), \qquad (3)$$

where $h_q$ is the $q$-ary entropy function defined by

$$h_q(x) = x \log_q \frac{q - 1}{x} + (1 - x) \log_q \frac{1}{1 - x}.$$

Porat and Rothschild [28] show that it is possible to construct in time $O(M_q N)$ a $q$-ary
code that achieves the Gilbert-Varshamov (GV) bound. To have such a construction, they
exploit the following well-known fact: a $q$-ary linear code with random generator matrix
achieves the GV bound with high probability [29]. To have an explicit construction of such
codes, a derandomization method known as the method of conditional expectation HI is 
used. In this method, the entries of the generator matrix of the code are chosen one by one
so that the minimum distance of the resulting code does not go below the value prescribed
by the Gilbert-Varshamov bound (3). For a detailed description of the procedure, see [28].

With the above construction with proper parameters we can have a disjunct matrix with
the following property.

Theorem 6. Suppose $\epsilon > 2(t + 1)e^{-at}$ for some constant $a > 1$. It is possible to explicitly
construct a type-2 $(t, \epsilon)$-disjunct matrix of size $M \times N$ where

$$M = O\left( t^{3/2} \ln N \, \frac{\sqrt{\ln \frac{2(t+1)}{\epsilon}}}{\ln t - \ln\ln \frac{2(t+1)}{\epsilon} + \ln(4a)} \right).$$

To prove this theorem we need the following identity implicit in [28]. We present the
proof here for completeness.

Lemma 7. For any $q > s$,

$$1 - h_q(1 - 1/s) = \frac{1}{s \ln q}\Big(\ln \frac{q}{s} + \frac{s}{q} - 1\Big) - o\Big(\frac{1}{s \ln q}\Big).$$

The proof of this is deferred to Appendix C. Now we are ready to prove Theorem 6.
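
Lemma 7 can be sanity-checked numerically. The sketch below assumes the standard $q$-ary entropy function $h_q(x) = x \log_q \frac{q-1}{x} + (1-x) \log_q \frac{1}{1-x}$ and compares $1 - h_q(1 - 1/s)$ with the main term $\frac{1}{s \ln q}\big(\ln \frac{q}{s} + \frac{s}{q} - 1\big)$ for one hypothetical pair $(q, s)$; it illustrates the asymptotic statement and is not a proof.

```python
from math import log


def q_ary_entropy(x, q):
    """Standard q-ary entropy function h_q(x) for 0 <= x <= 1."""
    if x == 0:
        return 0.0
    if x == 1:
        return log(q - 1) / log(q)
    return (x * log(q - 1) - x * log(x) - (1 - x) * log(1 - x)) / log(q)


def lemma7_main_term(q, s):
    """(ln(q/s) + s/q - 1) / (s ln q), the main term in Lemma 7."""
    return (log(q / s) + s / q - 1) / (s * log(q))


q, s = 10**6, 10**3  # hypothetical values with q much larger than s
exact = 1 - q_ary_entropy(1 - 1 / s, q)
approx = lemma7_main_term(q, s)
rel_gap = abs(exact - approx) / approx
print(exact, approx, rel_gap)
```

For these values the two quantities agree to within a relative error well below $10^{-3}$, consistent with the $o\big(\frac{1}{s \ln q}\big)$ remainder.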

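Proof of Theorem 6. We follow the Kautz-Singleton code construction. We take a linear
$q$-ary code $C'$ of length $M_q = M/q$, size $N$ and minimum distance $d_q = d/2$. Each $q$-ary symbol
in the codewords is then replaced with a binary indicator vector of length $q$ (i.e., the binary
vector all of whose entries are zero but one entry, which is 1) according to the map $\phi$. As a
result we have a binary code $C$ of length $M$ and size $N$. The minimum distance of the code
is $d$ and the codewords are of constant weight $w = M_q = M/q$. The average distance of this
code is twice the average distance of the $q$-ary code. As $C'$ is linear (assuming it has no
all-zero coordinate), it has average distance equal to

$$\frac{1}{N - 1} \sum_{j=1}^{M_q} j A_j = \frac{N}{N - 1} \sum_{j=0}^{M_q} \frac{j A_j}{N} = \frac{N}{N - 1} M_q (1 - 1/q),$$

where $A_j$ is the number of codewords of weight $j$ in $C'$. Here we use the fact that the
average of the distance between any two randomly chosen codewords of a nontrivial linear
code is equal to that of a binomial random variable [22]. Hence the constant weight code $C$
has average distance

$$D = \frac{2N}{N - 1} M_q (1 - 1/q).$$

The resulting matrix will be $(t, \epsilon)$-disjunct if the condition of Theorem 3 is satisfied, i.e.,

$$d_q \ge M_q - \frac{M_q - 1 - t\big(M_q - \frac{N}{N-1} M_q (1 - 1/q)\big)}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}}
= M_q - \frac{M_q - 1 - \frac{t M_q}{N - 1}(N/q - 1)}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}},$$

or if

$$d_q \ge M_q - \frac{M_q - 1 - t M_q / q}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}}.$$

To construct a desired $q$-ary code, we use the ideas of [28] where the explicitly
constructed codes meet the Gilbert-Varshamov bound. It is possible to construct in time
polynomial in $N$, $M_q$, a $q$-ary code of length $M_q$, size $N$ and distance $d_q$ when

$$\frac{\log_q N}{M_q} \le 1 - h_q\Big(\frac{d_q}{M_q}\Big).$$

Therefore, explicit polynomial time construction of a type-2 $(t, \epsilon)$-disjunct matrix will
be possible whenever

$$\frac{\log_q N}{M_q} \le 1 - h_q\left(1 - \frac{1 - 1/M_q - t/q}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}}\right). \qquad (4)$$

Let us now use the fact that we have taken $q = \beta t$ to be a prime power for some constant
$\beta$. Let us choose $\beta > 2e\alpha\sqrt{a} + 1$.
Hence,

$$\frac{1 - 1/M_q - t/q}{\alpha \sqrt{t \ln \frac{2(t+1)}{\epsilon}}} = \frac{\gamma}{\sqrt{t \ln \frac{2(t+1)}{\epsilon}}} = \frac{1}{s} \quad \text{(say)},$$

for an absolute constant $\gamma$. At this point, we see

$$\frac{q}{s} = \frac{\beta \gamma t}{\sqrt{t \ln \frac{2(t+1)}{\epsilon}}} > \frac{\beta \gamma}{\sqrt{a}} > 2e,$$

from the condition on $\epsilon$ and the values of $\beta$, $\gamma$. Now, using Lemma 7, the right hand side of
Eqn. (4) equals

$$\frac{1}{s \ln q}\Big(\ln \frac{q}{s} + \frac{s}{q} - 1\Big) - o\Big(\frac{1}{s \ln q}\Big) > \frac{\ln \frac{q}{s} - 1 - o(1)}{s \ln q}.$$

Then explicit polynomial time construction of a type-2 $(t, \epsilon)$-disjunct matrix will be
possible whenever

$$\frac{\ln N}{M_q \ln q} \le \frac{\ln \frac{q}{s} - 1 - o(1)}{s \ln q},$$

or, using $M = q M_q = \beta t M_q$ and $\ln \frac{q}{s} = \ln(\beta\gamma) + \frac{1}{2}\big(\ln t - \ln\ln \frac{2(t+1)}{\epsilon}\big)$, whenever

$$M \ge \frac{\beta}{\gamma} \cdot \frac{t^{3/2} \ln N \sqrt{\ln \frac{2(t+1)}{\epsilon}}}{\frac{1}{2}\big(\ln t - \ln\ln \frac{2(t+1)}{\epsilon}\big) + \ln(\beta\gamma) - 1 - o(1)}.$$

The condition on $\epsilon$ and the value chosen for $\beta$ ensure that the denominator is strictly
positive. Hence it suffices to have

$$M \ge \frac{2\beta}{\gamma} \cdot \frac{t^{3/2} \ln N \sqrt{\ln \frac{2(t+1)}{\epsilon}}}{\ln t - \ln\ln \frac{2(t+1)}{\epsilon} + \ln(4a) - o(1)}.$$

□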

Note that the implicit constant in Theorem 6 is proportional to $\sqrt{a}$. We have not
particularly tried to optimize the constant. However, even then the value of the constant is about
$8e\sqrt{a}$.

Remark: As in the case of Theorem 3, with a simple change in the proof, it is easy to see
that one can construct a test matrix that is type-1 $(t, \epsilon)$-disjunct if

$$M = O\left( t^{3/2} \ln N \, \frac{\sqrt{\ln \frac{2(N-t)}{\epsilon}}}{\ln t - \ln\ln \frac{2(N-t)}{\epsilon} + \ln(4a)} \right)$$

for any $\epsilon > 2(N - t)e^{-at}$, and a constant $a$.

It is clear from Prop. 1 that a type-1 $(t, \epsilon)$-disjunct matrix is equivalent to a group testing
scheme. Hence, as a consequence of Theorem 6 (specifically, the remark above), we will
be able to construct a testing scheme with

$$O\left( t^{3/2} \log N \, \frac{\sqrt{\log \frac{2(N-t)}{\epsilon}}}{\log t - \log\log \frac{2(N-t)}{\epsilon}} \right)$$

tests. Whenever the defect-model is such that all the possible defective sets of size $t$ are
equally likely and there are no more than $t$ defective elements, the above group testing
scheme will be successful with probability at least $1 - \epsilon$.

Note that, if $t$ is proportional to any positive power of $N$, then $\log N$ and $\log t$ are of the same
order. Hence it will be possible to have the above testing scheme with $O\big(t^{3/2}\sqrt{\log(N/\epsilon)}\big)$
tests, for any $\epsilon > 2(N - t)e^{-t}$.
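
To see the scale of the saving, the sketch below evaluates the expression $8e\, t^{3/2} \log N \sqrt{\log\frac{2(N-t)}{\epsilon}} \big/ \big(\log t - \log\log\frac{2(N-t)}{\epsilon}\big)$ quoted in Section 1.1 and compares it with $t^2 \log N$. Two assumptions are ours: natural logarithms are used throughout, and the constant hidden in the $O(t^2 \log N)$ bound for fully disjunct matrices is taken to be 1, so the comparison is indicative only. The function names and the parameter values are hypothetical.

```python
from math import e, exp, log, sqrt


def almost_disjunct_tests(n_items, t, eps):
    """8e t^{3/2} ln N sqrt(ln(2(N-t)/eps)) / (ln t - ln ln(2(N-t)/eps))."""
    if eps <= 2 * (n_items - t) * exp(-t):
        raise ValueError("eps must exceed 2(N - t) e^{-t}")
    big = log(2 * (n_items - t) / eps)
    denom = log(t) - log(big)
    if denom <= 0:
        raise ValueError("t is too small for this expression to be meaningful")
    return 8 * e * t ** 1.5 * log(n_items) * sqrt(big) / denom


def disjunct_tests(n_items, t):
    """t^2 ln N, with the implicit constant set to 1 (an assumption)."""
    return t ** 2 * log(n_items)


n_items, t, eps = 10**15, 10**5, 0.01  # t = N^(1/3)
relaxed = almost_disjunct_tests(n_items, t, eps)
strict = disjunct_tests(n_items, t)
print(f"{relaxed:.3e} {strict:.3e} ratio={relaxed / strict:.3f}")
```

For moderate $t$ the constant $8e$ can outweigh the gain of $\sqrt{t}$; the advantage of the relaxed scheme shows up once $t$ is large, as in the example with $t = N^{1/3} = 10^5$.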

5. Conclusion 

In this work we show that it is possible to construct non-adaptive group testing schemes
with a small number of tests that identify a uniformly chosen random defective configuration
with high probability. To construct a $t$-disjunct matrix one starts with the simple relation
between the minimum distance $d$ of a constant weight code and $t$. This is an example of
a scenario where a pairwise property (i.e., distance) of the elements of a set is translated
into a property of $t$-tuples.

Our method of analysis provides a general way to prove that a property holds for almost
all $t$-tuples of elements from a set based on the mean pairwise statistics of the set. Our
method will be useful in many areas of applied combinatorics, such as digital fingerprinting
or design of key-distribution schemes, where such a translation is evident. For example,
with our method new results can be obtained for the cases of cover-free codes [13, 19, 31],
traceability and frameproof codes [7, 30]. This is the subject of our ongoing work.
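Appendix A. Proof of Lemma 4

We have created a sequence here that is a martingale by construction. This is a standard
method due to Doob [8, 25]. Let

$$Y_j = w - \frac{d_H(c_1, c_j)}{2}.$$

Consider the $\sigma$-algebras $\mathcal{F}_k$, $k = 0, \ldots, t + 1$, where $\mathcal{F}_0 = \{\emptyset, [N]\}$ and $\mathcal{F}_k$ is generated
by the partition of the set of $\binom{N}{t+1}$ possible choices for $(t + 1)$-sets into $\binom{N}{k}$ subsets with
the fixed value of the first $k$ indices, $1 \le k \le t + 1$. The sequence of increasingly refined
partitions $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \cdots \subset \mathcal{F}_{t+1}$ forms a filtration such that $Z_k$ is measurable with respect
to $\mathcal{F}_k$ (is constant on the atoms of the partition).
We have,

$$\begin{aligned}
Z_i &= \mathbb{E}\Big(\sum_{j=2}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_i\Big) \\
&= \sum_{j=2}^{i} Y_j + \mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_i\Big) \\
&= Z_{i-1} + Y_i + \mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_i\Big) - \mathbb{E}\Big(\sum_{j=i}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}\Big).
\end{aligned}$$

We then have,

$$\begin{aligned}
\mathbb{E}(Z_i \mid Z_1, \ldots, Z_{i-1}) &= Z_{i-1} + \mathbb{E}(Y_i \mid Z_1, \ldots, Z_{i-1})
+ \mathbb{E}\Big(\mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_i\Big) \;\Big|\; Z_1, \ldots, Z_{i-1}\Big) \\
&\quad - \mathbb{E}\Big(\mathbb{E}\Big(\sum_{j=i}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}\Big) \;\Big|\; Z_1, \ldots, Z_{i-1}\Big) \\
&= Z_{i-1} + \mathbb{E}(Y_i \mid Z_1, \ldots, Z_{i-1})
+ \mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Z_1, \ldots, Z_{i-1}\Big)
- \mathbb{E}\Big(\sum_{j=i}^{t+1} Y_j \;\Big|\; Z_1, \ldots, Z_{i-1}\Big) \\
&= Z_{i-1}.
\end{aligned}$$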



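Appendix B. Proof of Lemma 5
Let us again assume that

$$Y_j = w - \frac{d_H(c_1, c_j)}{2}.$$

We have,

$$\begin{aligned}
|Z_i - Z_{i-1}| &= \Big| \mathbb{E}\Big(\sum_{j=2}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_i\Big) - \mathbb{E}\Big(\sum_{j=2}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}\Big) \Big| \\
&\le \max_{0 \le a, b \le w - d/2} \Big| \mathbb{E}\Big(\sum_{j=2}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}, Y_i = a\Big) - \mathbb{E}\Big(\sum_{j=2}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}, Y_i = b\Big) \Big| \\
&= \max_{0 \le a, b \le w - d/2} \Big| \sum_{j=2}^{t+1} \Big( \mathbb{E}(Y_j \mid Y_2, \ldots, Y_{i-1}, Y_i = a) - \mathbb{E}(Y_j \mid Y_2, \ldots, Y_{i-1}, Y_i = b) \Big) \Big| \\
&= \max_{0 \le a, b \le w - d/2} \Big| a - b + \mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}, Y_i = a\Big) - \mathbb{E}\Big(\sum_{j=i+1}^{t+1} Y_j \;\Big|\; Y_2, \ldots, Y_{i-1}, Y_i = b\Big) \Big| \\
&\le \max_{0 \le a, b \le w - d/2} \Big( w - d/2 + \sum_{j=i+1}^{t+1} \Big| \mathbb{E}\Big(w - \frac{d_H(c_1, c_j)}{2} \;\Big|\; d_H(c_1, c_2), \ldots, d_H(c_1, c_i) = 2(w - a)\Big) \\
&\qquad\qquad - \mathbb{E}\Big(w - \frac{d_H(c_1, c_j)}{2} \;\Big|\; d_H(c_1, c_2), \ldots, d_H(c_1, c_i) = 2(w - b)\Big) \Big| \Big) \\
&\le w - d/2 + \sum_{j=i+1}^{t+1} \frac{w - d/2}{N - i} \\
&= (w - d/2)\Big(1 + \frac{t - i + 1}{N - i}\Big) \\
&= (w - d/2)\, c_i,
\end{aligned}$$

where $c_i = 1 + \frac{t - i + 1}{N - i}$.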



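Appendix C. Proof of Lemma 7
Proof. The proof is straightforward and uses the following approximation:

$$\ln x - \ln(x - 1) = \frac{1}{x} + o\Big(\frac{1}{x}\Big).$$

We have,

$$\begin{aligned}
1 - h_q(1 - 1/s) &= \frac{1}{\ln q}\Big( \Big(1 - \frac{1}{s}\Big)\Big( (\ln q - \ln(q - 1)) - (\ln s - \ln(s - 1)) \Big) + \frac{1}{s} \ln \frac{q}{s} \Big) \\
&= \frac{1}{\ln q}\Big( \frac{1}{q} - \frac{1}{qs} - \frac{1}{s} + \frac{1}{s^2} + \frac{1}{s} \ln \frac{q}{s} \Big) - o\Big( \frac{1}{\ln q}\Big(\frac{1}{s} - \frac{1}{q}\Big) \Big) \\
&= \frac{1}{s \ln q}\Big( \ln \frac{q}{s} + \frac{s}{q} - 1 \Big) - o\Big(\frac{1}{s \ln q}\Big).
\end{aligned}$$

□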



